    Automated Crowdturfing Attacks and Defenses in Online Review Systems

    Malicious crowdsourcing forums are gaining traction as sources of online misinformation, but they are limited by the cost of hiring and managing human workers. In this paper, we identify a new class of attacks that leverage deep learning language models (recurrent neural networks, RNNs) to automate the generation of fake online reviews for products and services. Not only are these attacks cheap and therefore more scalable, but they can control the rate of content output to eliminate the signature burstiness that makes crowdsourced campaigns easy to detect. Using Yelp reviews as an example platform, we show how a two-phase review generation and customization attack can produce reviews that state-of-the-art statistical detectors cannot distinguish from real ones. We conduct a survey-based user study to show that these reviews not only evade human detection but also score high on "usefulness" metrics assigned by users. Finally, we develop novel automated defenses against these attacks by leveraging the lossy transformation introduced by the RNN training and generation cycle. We consider countermeasures against our defenses, show that they produce unattractive cost-benefit tradeoffs for attackers, and show that they can be further curtailed by simple constraints imposed by online service providers.
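    To make the generation phase concrete, below is a minimal, hypothetical sketch of character-level RNN text generation with temperature sampling, in the spirit of the attack the abstract describes. The toy corpus, model size, and hyperparameters are illustrative assumptions, not the paper's setup, and the paper's second (customization) phase, which substitutes domain-specific nouns, is omitted.

```python
# Hypothetical sketch: train a tiny character-level LSTM on a toy
# corpus, then sample text with a temperature knob. Everything here
# (corpus, sizes, training length) is illustrative only.
import torch
import torch.nn as nn

corpus = "great food and friendly staff. will come back again. "
chars = sorted(set(corpus))
stoi = {c: i for i, c in enumerate(chars)}
itos = {i: c for c, i in stoi.items()}

class CharRNN(nn.Module):
    def __init__(self, vocab, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab)

    def forward(self, x, state=None):
        h, state = self.lstm(self.embed(x), state)
        return self.head(h), state

model = CharRNN(len(chars))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
data = torch.tensor([stoi[c] for c in corpus]).unsqueeze(0)

for _ in range(200):  # overfit the toy corpus so sampling is coherent
    logits, _ = model(data[:, :-1])  # predict each next character
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, len(chars)), data[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()

def sample(prefix="great", n=80, temperature=0.7):
    # Lower temperature -> more conservative, fluent-looking text.
    x = torch.tensor([[stoi[c] for c in prefix]])
    out, state = list(prefix), None
    for _ in range(n):
        logits, state = model(x, state)
        probs = torch.softmax(logits[0, -1] / temperature, dim=0)
        idx = torch.multinomial(probs, 1).item()
        out.append(itos[idx])
        x = torch.tensor([[idx]])
    return "".join(out)

print(sample())
```

    The sampling temperature is one knob such an attacker could tune: lowering it trades diversity for fluency, which bears directly on both the burstiness and the detectability properties discussed above.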

    Low-rank matrix factorization for Deep Neural Network training with high-dimensional output targets

    While Deep Neural Networks (DNNs) have achieved tremendous success for large vocabulary continuous speech recognition (LVCSR) tasks, training of these networks is slow. One reason is that DNNs are trained with a large number of parameters (i.e., 10-50 million). Because networks are trained with a large number of output targets to achieve good performance, the majority of these parameters are in the final weight layer. In this paper, we propose a low-rank matrix factorization of the final weight layer. We apply this low-rank technique to DNNs for both acoustic modeling and language modeling. We show on three different LVCSR tasks, ranging between 50 and 400 hours, that a low-rank factorization reduces the number of parameters of the network by 30-50%. This results in a roughly equivalent reduction in training time, without a significant loss in final recognition accuracy, compared to a full-rank representation.
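    A hedged sketch of the core idea: replace the single hidden-to-output weight matrix with two low-rank factors and compare parameter counts. The layer sizes and rank below are illustrative assumptions, not values from the paper.

```python
# Low-rank bottleneck in the final layer: W (hidden x targets) is
# replaced by A (hidden x rank) followed by B (rank x targets).
# Sizes are illustrative: 2048 hidden units, 10k output targets.
import torch.nn as nn

hidden, targets, rank = 2048, 10000, 256

full = nn.Linear(hidden, targets)
low_rank = nn.Sequential(
    nn.Linear(hidden, rank, bias=False),  # factor A
    nn.Linear(rank, targets),             # factor B
)

def n_params(m):
    return sum(p.numel() for p in m.parameters())

# ~20.5M parameters for the full layer vs ~3.1M for the factored one.
print(n_params(full), n_params(low_rank))
```

    Because the output-layer matrix dominates the parameter budget when the target count is large, shrinking this one layer is what yields the 30-50% whole-network reduction the abstract reports.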

    Unlimited vocabulary speech recognition for agglutinative languages

    It is practically impossible to build a word-based lexicon for speech recognition in agglutinative languages that would cover all the relevant words. The problem is that words are generally built by concatenating several prefixes and suffixes to word roots. Together with compounding and inflection, this leads to millions of distinct but still frequent word forms. Due to inflection, ambiguity, and other phenomena, it is also not trivial to automatically split words into meaningful parts. Rule-based morphological analyzers can perform this splitting, but because their rules are handcrafted, they also suffer from an out-of-vocabulary problem. In this paper we apply a recently proposed, fully automatic, and largely language- and vocabulary-independent method to build subword lexica for three agglutinative languages. We demonstrate language portability by building a successful large vocabulary speech recognizer for each language, and we show superior recognition performance compared to the corresponding word-based reference systems.
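    To illustrate what a subword lexicon buys, here is a toy greedy longest-match segmenter over a hand-picked subword inventory. This is purely hypothetical: the paper's approach learns its inventory automatically from data rather than using a fixed list, and greedy matching is a stand-in for a proper statistical segmentation.

```python
# Toy illustration: decompose an agglutinative word form into subword
# units so a small lexicon can cover many inflected surface forms.
# The Finnish-flavoured inventory below is a hand-picked assumption.
subwords = {"talo", "i", "ssa", "kin", "auto"}

def segment(word, units):
    parts, i = [], 0
    while i < len(word):
        # Prefer the longest unit that matches at position i.
        for j in range(len(word), i, -1):
            if word[i:j] in units:
                parts.append(word[i:j])
                i = j
                break
        else:
            parts.append(word[i])  # fall back to a single character
            i += 1
    return parts

print(segment("taloissakin", subwords))  # ['talo', 'i', 'ssa', 'kin']
```

    A lexicon over such units stays small and closed while still covering unseen inflected forms, which is why the subword recognizers outperform the word-based reference systems.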